MISTRAL+: dedicated tool for under-resourced languages analysis

Authors

  • Benoît Weber
  • Geneviève Caelen-Haumont
  • Do Dat Tran
  • Binh Hai Pham
Abstract

This paper presents MISTRAL+, a dedicated tool for the study of under-resourced languages. MISTRAL+ is the upgraded version of MELISM, an automatic tool created in 2004. The entire process has been modified to simplify and enhance the study of under-resourced languages. MISTRAL+ is composed of two separate modules: MISTRAL_Praat, a plugin integrated into Praat, and MISTRAL_xls, a VBA module. MISTRAL_Praat creates an approximation of the signal under study, performs an automatic tonal annotation, and exports all data to a standard xls file. With MISTRAL_xls, the user can quickly and easily extract from the data generated by MISTRAL_Praat the information needed for a study. The first part presents MISTRAL+ and its main functionalities. The second part takes a closer look at MISTRAL_Praat. The third part describes the second module, MISTRAL_xls. The last part presents a study carried out with MISTRAL+ on Mo Piu, an under-resourced language spoken in Vietnam.

Similar resources

MISTRAL+: A Melody Intonation Speaker Tonal Range semi-automatic Analysis using variable Levels

This paper presents MISTRAL+, the upgraded version of an automatic tool created in 2004, first named INTSMEL and then MELISM. Since MELISM, the entire process has been modified in order to simplify and enhance the study of languages. MISTRAL+ is a combination of two modules: a Praat plugin, MISTRAL_Praat, and MISTRAL_xls. For specific corpora, it performs phonological annotation based on the F0 variation i...

Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages

Current approaches to word sense disambiguation and translation selection typically require lexical resources or large bilingual corpora with rich information fields and annotations, which are often infeasible for under-resourced languages. We extract translation context knowledge from bilingual comparable corpora of a richer-resourced language pair, and inject it into a multilingual lexicon...

Towards the tonal system of an unknown language from south-east Asia: a deeper insight

This paper focuses on melodic and tonal analyses of Mo Piu, a language without a writing system spoken by an endangered ethnic minority of Southeast Asia in North Vietnam. The Mo Piu language is a still unknown branch of the Hmong-Mien family. Building on previous experience, we try to gain a deeper insight into the tonal system of this language, with the support of a dedicated tool,...

The development of new corpora for under-resourced languages using data available for well-resourced ones

In this paper we propose to exploit existing corpora of well-resourced languages as a basis for developing similar corpora of under-resourced ones. The construction of this type of corpus will allow finding common patterns of acoustic manifestation of similar functional states regardless of the language. The analysis of these corpora will also allow investigating universal and language-specific ...

Automatic Speech Recognition for Under-Resourced Languages:

Speech processing for under-resourced languages is an active field of research that has made significant progress during the past decade. In this paper, we propose a survey that focuses on automatic speech recognition (ASR) for these languages. Under-resourced languages and the challenges associated with them are first defined. The main part of the paper is a literatur...

Publication date: 2012